Similar resources
Rapid Distance-Based Outlier Detection via Sampling
Distance-based approaches to outlier detection are popular in data mining, as they do not require modeling the underlying probability distribution, which is particularly challenging for high-dimensional data. We present an empirical comparison of various approaches to distance-based outlier detection across a large number of datasets. We report the surprising observation that a simple, sampling...
Scalable Simple Random Sampling and Stratified Sampling
Analyzing data sets of billions of records has now become a regular task in many companies and institutions. In the statistical analysis of those massive data sets, sampling generally plays a very important role. In this work, we describe a scalable simple random sampling algorithm, named ScaSRS, which uses probabilistic thresholds to decide on the fly whether to accept, reject, or wait-list an...
Stratified Median Ranked Set Sampling: Optimum and Proportional Allocations
In this paper, for the Stratified Median Ranked Set Sampling (SMRSS) proposed by Ibrahim et al. (2010), we examine the proportional and optimum sample allocations, two well-known methods for sample allocation in stratified sampling. We show that the variances of the mean estimators of a symmetric population in SMRSS using optimum and proportional allocations to strata are smaller than ...
Journal
Journal title: ACM SIGARCH Computer Architecture News
Year: 2001
ISSN: 0163-5964
DOI: 10.1145/384285.379273